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Abstract 

Using an asymmetric Laplace distribution, which provides a mechanism for Bayesian 
inference of quantile regression models, we develop a fully Bayesian approach to fitting 
single-index models in conditional quantile regression. In this work, we use a Gaussian 
process prior for the unknown nonparametric link function and a Laplace distribu- 
tion on the index vector, with the latter motivated by the recent popularity of the 
Bayesian lasso idea. We design a Markov chain Monte Carlo algorithm for posterior 
inference. Careful consideration of the singularity of the kernel matrix, and tractabil- 
ity of some of the full conditional distributions leads to a partially collapsed approach 
where the nonparametric link function is integrated out in some of the sampling steps. 
Our simulations demonstrate the superior performance of the Bayesian method versus 
the frequentist approach. The method is further illustrated by an application to the 
hurricane data. 

Keywords: Gaussian process prior; Markov chain Monte Carlo; Quantile regression; 
Single-index models. 

1 Introduction 



(Hardle et al. 


( 


1993 


); 



the "curse of dimensionality" in nonparametric problems by assuming that the response is 
only related to a single linear combination of the covariates. Compared to fully nonparamet- 
ric regression, it offers a nice tradeoff between simplicity and modelling power. The fitting of 
single-index models, commonly bas ed on splines or kern el methods, has found wide applica- 
tion in the literature. For example, lHardle et al.l (119931 ) used SIM to study the dependence 



of the sev erity of side impa cts on the velocity and acceleration of the automobile in an acci- 



dent, and Kia et al.l ( 120041 ) demonstrated that SIM provides a good fit in a study trying to 
identify causal factors associated with the prevalence and incidence of depression. However, 
efficient and stable estima tion of SIMs is stil l a challenging prob lem and has inspired many 



recent works in this area ( jWang et al.l ( 120101 ); iLiang et al.l ( 120101 )). 
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Although frequentist estimation of SIMs has a lon g history, the Bay e sian appro ach to 
fitting these models has only appeared quite recently. lAntoniadis et al.l ( 120041 ) and IWang 
( 120091 1 proposed a Bayesian approach usi ng polynomial splines to m odel the nonparametric 
link function, while IChoi et al.l (120 llf) andlGramacv and Lianl (1201 if ) use a Gaussian process 
(GP) prior. As noted in iGramacy and Lianl (1201 if ), one advantage of using GPs as the 
prior for the link function is that the index vector does not have to be normalized to have 
unit norm, which makes the choice of prior easier, and subsequently the sampling algorithm 
simplifies too. 

However, the restriction of these works to mean regression, that is on estimating the 
conditional mean regression function, may be a limitation. As a useful supplement to mean 
regression, quantile regression produces a more complete description of the conditional re- 
sponse distribution. In particular, it can uncover different structural relationships between 
covariates and responses at the upper or lower tails, which is sometimes of significant interest 
in econometrics applications. Furthermore, compared to mean regression, median regression 
(which is a special case of quantile regression) is more robust to outliers or heavy-tailed 
random errors. 

In this article, we consider a single-index quantile regression model. For a given quantile 
level r G (0, 1) and i.i.d. pairs (x iy yi), it is given by 



Qyi\xi( T ) = V(Xi (3) 



1,2, 



n. 



Here y^ is the response, xi 



[Xn, 



x 



ipj 



is the p-dimensional predictor vector, Q yi \ a 



F y \ x X-) i s the inverse cumulative distribution function of the response given the predictors, 
r](.) is the unknown univariate link function, and (3 = /5 2 , ■ • • , P P ) T is the index which 
implicitly depends on the desired quantile r. Dimension reduction is achieved specifically 
by the index vector so that rj is a univariate function instead of p-variate one, as in fully 
nonparametric regression. 

In this paper, we propose a Bay esian treatment of the single-index quantile regression 
model. Recently, IWu et al.l (120 10f ) has considered a similar model using kernel regres- 
sion, which will serve as a frequentist benchmark for our methods. We establish a hier- 
archical Bayesian model by adopting the asymmetric Laplace distribution, which is one 
common approach among a few alternatives in the Bayesian quantile regression literature 



(lYu and Moveedl. l200ll; iKottas and Krnjajid . 120091; iReich et al. 



20101 ; iTokdar and Kadand . 1201 11 ) . Following IGramacy and Lianl (120111 ) , we assign a Gaus 



2010 



Lancaster and Jun 



sian process prior on the link function, to obtain a flexible nonparametric quantile regression 
model. The posterior inference of all parameters is performed via Markov chain Monte Carlo 
computations which automatically incorporates all sources of uncertainty. 

The remainder of the paper proceeds as follows. In Section 2, we describe the structure 
of our hierarchical Bayesian single-index quantile regression model and discuss our prior 
choices. We also consider the posterior sampling algorithm focusing on a more efficient 
partially collapsed sampler, where the link function is integrated out when drawing samples 
of the index vector. This is explained in detail in the Appendix. Then, numerical illustrations 
including simulation studies and a real data example are presented in Section 3. We conclude 
the paper with a discussion in Section 4. 
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2 Hierarchical Bayesian Modelling 

At the r-th quantile, w e mod el the residual errors by th e asymmetr i c Lap lace distribution 
(ALD, lYu and Moveedl fl200lh : iGeraci and Bottail fl2007t ): ILuo et al.l toilh ). More specifi- 
cally, the probability distribution of y given p = r](x T f3) is assumed to be 



r(l-r) 



exp 



■-pAv-p) 



where p T {u) = u(r — I(u < 0)) is the so-called check function, the quantile level r is the 
skewness parameter in the distribution, /i is the location parameter, and a is the scale 
parameter. In our context, with the setting /ij = r](xf(3), and y = (2/1,2/2,- " ,y n ) T , the 
conditional distribution for the observations is 



r n ( l - rf J lA T \ 

— exp I > p T (y t - r){Xi (3)) ) . 

I a U J 



Quantile regression is typically based on minimization of the check loss function. How- 
ever, direct use of the likelihood above is rather inconvenient for Bayes i an in ference. A 
location-scale mixture representation of the ALD fjKozumi and Kobayashil ( 1201 ll )) is helpful 
here. We can write the observations satisfying ([T]) alternatively as 

yi = r]( x If3) + hei + \fh^e i z il 

where ~ exp(l/cr) is an exponential random variable with mean a, Zi is a standard normal 
random variable and is independent of e iy hi = , and hi = T ri_ T \ ■ This suggests treating 
the Ci as latent variables, where the conditional distribution of y is rewritten as 



7r(y\(3, t), a, e n ) = JJ(27rfc 2 o-ei) 1/2 exp 



i=i 



1 



oc exp 



2/c 2 cre 

y-rj n - kie n ) T E~ 1 {y -rj n - k x e n 



(Vi ~ V(xj(3) - he,) 2 ^ 

(det[E})- 1/2 . 



Here e n = (e h e 2 , • • • e n ) T , E = fc 2 crdiag(ei, e 2 , • • • e n ), and rj n = (r] 1 ,--- , 7] n ) T = 
(y(x?0) • • • Mx m f. 

As in I Choi et al.l ( 201ll ) and lGramacy and Lianl ( 201ll ). we model the link function by a 
Gaussian process prior distribution. More specifically, rj is a Gaussian process a priori, with 
zero mean and a squared-exponential covariance function, 



V 



GP(0, C(-,-))> C(x,x') = 7 exp{-(x-x / )74 



(2) 



where 7 and d are hyperparameters. Writing this out in the single-index model framework 
using the observed covariates Xi, we have 



7T 



where C n is an n x n matrix with entries C(x i: Xj) = 7exp{ — (xj(3 — xjf3) 2 /d}. 

In the literature of single-index models, it is well-known that 77 and (3 are unidentifiable 



since r](x T f3) = r] c (x T (c/3)) , c ^ 0, where r\ c 



7](./c) 



and thus (3 is only identifiable up 
to a constant scale. It is typically assumed that \\f3\\ = 1 so that /3 is identifiable u p to 
sign (/3 and — (3 leads to exactly the same model fit). Accordingly, IChoi et al.l (1201 ll ) also 
suppose the suppor t of th e prior distribution for (3 is on the unit sphere. On the other hand, 
Gramacy and Lianl ( 201 ll ) noted that without the constraint \\f3\\ = 1, only (3/\fd is identi- 
fiable when using the Gaussian process prior, and thus one can remove the range parameter 
d and also remove the unit norm constraint on /3, which is mathematically equivalent to 
imposing ||/3|| = 1 and keeping the range parameter d. With the latter approach, we have 
the advantage that the prior on (3 is more easily specified, and one fewer parameter (d) to 
worry about. Note that the model leaves the sign of (3 unidentified under either specification. 
In some cases when (3 is not of direct interest, it does not matter at all. When inference for 



(3 is a primary goal, some simple heuristics for reconciling the signs in iGramacy and Lian 
( 1201 ll ) can be used. 

We therefore adopt the approach in IGramacy and Lianl (120 111 ), so that the entries of 
C n are C(xi, Xj) = 7exp{ — {xj (3 — xj(3) 2 }. Since (3 is not constrained to have unit norm, 
we are free to choose any prior for /3. A typical choice is to put an independent Gaussian 
prior on each comp onent, which is sometimes called a ridg e prior. One can also consider 
the popular g-prior ( iZellnerl . Il986t iGeorge and Fosterl . |2000| ) . A generalization of the ridge 
prior is the so-called Bay e sian la s so, w hich has been of much interest in the recent literature 
f lPark and Casellal (120081 ): lHansI t00$ )). Under this prior, /3j, j = 1, . . . ,p are independent 
and identically Laplace, 

n ( ( 3\a,\) = f[^e- x ^\ A>0. 
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There are a suite of similar priors which share attractive p roperties and yet further gener- 
alize the lasso. Examples include th e norm al-gamma prior (I Griffin and Brown fl2010l)), t he 
Bayesian elastic set (e.g., Li and Lin ( 2010h ) and the horseshoe (e.g., Carvalho et al. ( 2010 )). 
Our reasons for calling these "generalizations" have to do with the form of their hierarchical 
latent variable representations, and corresponding data augmentation Gibbs samplers. We 
focus on the Bayesian lasso as a representative case in this paper. Simplifications (to the 
ridge) and further generalizations to the others are strai ghtforward. Recognizing that esti- 



mators for (3 are not equivariant under such priors (e.g., iPark and Casellal (120081 )). we take 
the common pre-processing step of scaling the inputs Xi to have a unit L2-norm. This also 
simplifies the choice of default priors for A and a. 

To summarize, our Bayesian hierarchical formulation is provided below. 

y\rj n , e n , a ~ N(r] n + k x e n , E), 
?7 n |a;i,7,/3 ~ N(0,C n ), 

(3 ~ n(P\a, A), e 4 exp(l/<r), 
a ~ 7r(cr), A ~ vt(A), 7 ~ 7t(7). 



The hyperpriors for cr, A, 7 are set to be IG(a a ,b a ), Ga(a\,b\) and 7G(a 7 ,6 7 ) 



where IG 
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denotes the inverse Gamma distribution and Ga denotes the Gamma distribution. All of 
the hyperparameters a a , b a , a\, b\, a 7 , 6 7 are set to be 0.5 in all our numerical experiments. 
Sensitivity analyses reveal that our results are not sensitive to these choices. 

The posterior distribution of various variables and parameters is found via MCMC, using 
a partially collapsed version integrating out r] n . The details are left to the Appendix. 



3 Numerical Illustrations 

We present three simulation examples and a real data application to illustrate the proposed 
method, the Bayesian quantile single-index regression model, which is denoted by BQSIM 
for short in the rest of the article. The MCMC algorithm is implemented in R, and available 
upon request. 



3.1 Simulations 



We illustrate the performance of the proposed metho d by comparing i t with a non-Bayesian 
single-index quantile regression approach described by lWu et al. fl2010h . based on kernel esti- 
mation. This frequentist method is denoted by QSIM in the following. Since the frequentist 
approach requires the identifiability constraint ||/3|| = 1, we also normalized the Bayesian 
estimate of the index vector to have unit norm and furthermore require the first component 
of the index vector to be positive to resolv e the sign indet erminacy. The following three 
simulation examples are directly taken from IWu et al. fl2010h . 



Example 1 

Consider data generated from the following single-index model with homoscedastic errors, 

V(t-A)' 



y = r](x (3) + 0.1Z, rj(t) = sin 



C- A 



with (3 = (/3!,/3 2 ,/33) T = ^(1,1, If, A=f-^£7=f + ^,Z~ N(0,1). 
The predictors x = (xi,X2,Xs) T are uniform in [0, l] 3 . We consider sample sizes n = 100 
and n = 200. For each sample size, we fit the models at seven different quantiles r = 
0.1,0.25,0.5,0.75,0.9,0.95,0.99. In each case, the MCMC algorithm is run for 20000 itera- 
tions with a burn-in of 10000. For convergence diagnosis, we present trace plots of f3, a, X 
and 7 in Figure [T] with two different sets of initial values using r = 0.5. The plots suggest 
that the constructed chains mix quickly. 

For a Bayesian point estimator we consider both the posterior mean and posterior median 
based on the sampled values after burn-in. The resulting estimates are summarized in Tables 
H]and[2], together with the sample standard deviation (S.D.) and 2.5% and 97.5% quantiles of 
the sampled values after burn-in. It is seen that both posterior mean and posterior median 
estimators work well and give similar estimates. Thus we only focus on posterior mean as 
our point estimate in the following. 
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Beta_1 Beta_2 Beta_3 




5000 10000 15000 20000 5000 10000 15000 20000 5000 10000 15000 20000 



Figure 1: Trace plots of fix, (3 2} (3%, a, 7 and A at quantile 0.5 for simulation example 1, when 
n = 100. Two chains with different starting values are illustrated. 

Figure [2] displays the boxplots of the estimated index vectors, comparing BQSIM and 
QSIM, with r G {0.1,0.25,0.5,0.75,0.9}. To save space, cases with r = 0.95 and 0.99 are 
not presented. The plots are based on 100 independently generated datasets in each case 
and show that the Bayesian estimates have smaller bias and lower variance. These plots 
generally give the impression that BQSIM produces more precise and stable estimates than 
QSIM. Mean squared errors (MSE) of the estimates based on these 100 replications in each 
case are shown in Table [3] for both sample sizes and all seven quantile levels. 

Figure H] shows the fitted r\ n values (posterior mean) by BQSIM at r = 0.5 on a typical 
run. In the left panel, the fitted r\ n are plotted against the true index xf (3 where f3 is the 
true value in the model. The true link function is also shown on the same figure. In the 
right panel, the fitted r/ n are plotted against the fitted index xjj3, where (3 is the posterior 
mean estimate, leading to visually smoother fitted values. 

Finally, to demonstrate that partial collapsing can significantly improve mixing, we show 
the autocorrelation plots of the (3 series in Figure [3J Observe that the autocorrelation of 
the series produced by the uncollapsed chain is much higher, implying the samples are much 
"stickier". As discussed in the Appendix, C n is nearly singular which caused numerical 
problems when using the uncollapsed Gibbs sampler. To avoid this numerical problem, a 
small artificial nugget effect is added to the matrix (that is we use C n + el in place of C n 
in evaluating the full conditional density, with e = 10~ 5 ). 
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Table 1: Results of BQSIM for simulation example 1, n = 100. 



n = 100 


True 


Mean 


Median 


S.D. 


2.5% 


97.5% 




Pi 


0.5774 


0.5666 


0.5817 


0.0157 


0.5354 


0.5970 


~~ n i n 

T — U . 1 U 


PI 


0.5774 


0.5739 


0.5749 


0.0159 


0.5420 


0.6044 




8 3 


0.5774 


0.5741 


0.5770 


0.0164 


0.5410 


0.6052 




Pi 


0.5774 


0.5785 


0.5766 


0.0154 


0.5481 


0.6084 


T— U.ZO 


a 

P2 


0.5774 


0.5748 


0.5734 


0.0153 


0.5442 


0.6047 




8 3 


0.5774 


0.5773 


0.5783 


0.0155 


0.5466 


0.6071 




Pi 


0.5774 


0.5773 


0.5775 


0.0157 


0.5465 


0.6080 


T — U.OU 


a 
P2 


0.5774 


0.5732 


0.5741 


0.0158 


0.5426 


0.6043 




r'o 


0.5774 


0.5802 


0.5815 


0.0159 


0.5486 


0.6118 




Pi 


0.5774 


0.5756 


0.5755 


0.0156 


0.5455 


0.6067 


T — U . / 


PI 


0.5774 


0.5738 


0.5740 


0.0156 


0.5429 


0.6046 




r'o 


0.5774 


0.5812 


0.5802 


0.0157 


0.5503 


0.6118 




Pi 


0.5774 


0.5760 


0.5751 


0.0157 


0.5453 


0.6071 


„ — n on 

/ — U. c/U 


PI 


0.5774 


0.5751 


0.5762 


0.0152 


0.5458 


0.6053 




P3 


0.5774 


0.5790 


0.5811 


0.0155 


0.5490 


0.6102 




Pi 


0.5774 


0.5819 


0.5823 


0.0159 


0.5511 


0.6127 


r=0.95 


P2 


0.5774 


0.5745 


0.5754 


0.0154 


0.5443 


0.6047 




P3 


0.5774 


0.5730 


0.5735 


0.0157 


0.5432 


0.6045 




Pi 


0.5774 


0.5771 


0.5786 


0.0212 


0.5338 


0.6142 


r=0.99 


P2 


0.5774 


0.5710 


0.5668 


0.0184 


0.5362 


0.6077 




P 3 


0.5774 


0.5729 


0.5802 


0.0225 


0.5304 


0.6124 



Example 2 

Now consider data generated as follows: 

y = 7](x T (3) + v/(sin(a; T /3) + l)Z, where 77(f) = 10 sin(0.75t), 

and j3 = (/3i,/?2) T = 77g(-'-'2) T ! x — (%i,X2) T , and Z is a standard normal random variable. 
The XjS, (j = 1, 2) are drawn identically and independently from a normal distribution with 
mean and variance 0.25 2 . We conduct simulations at r G {0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99} 
with n = 100 and n = 200, each with 100 replications. Figure E] shows the boxplots for the 
estimated index vector and Table 0] reports the mean squared errors. 

Example 3 

Next, we consider a regression model with exponentially distributed errors, 

y = r](x T (3) + £ , r](t) = 5cos(t) + exp(-t 2 ), 

where (3 = (J3 U (3 2 ) T = ^(l,2) T ,a; = (x 1 ,x 2 ) T . x 3 N(0,1), j = 1,2, and S ~ exp(l/2). 
Using the same sample sizes and quantile levels as for the previous two examples, the results 
are presented in Figure [6] and Tabled which again demonstrate the superiority of BQSIM. 
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Table 2: Results of BQSIM for simulation example 1, n = 200. 



n = 200 


True 


Mean 


Median 


S.D. 


2.5% 


97.5% 




0i 


0.5774 


0.5742 


0.5745 


0.0104 


0.5543 


0.5947 


n — n i n 

T — U . 1 u 


PI 


0.5774 


0.5749 


0.5768 


0.0010 


0.5567 


0.5959 




03 


0.5774 


0.5821 


0.5789 


0.0104 


0.5592 


0.6002 




0i 


0.5774 


0.5761 


0.5756 


0.0106 


0.5552 


0.5968 


T — U.ZD 


a 

P2 


0.5774 


0.5764 


0.5782 


0.0105 


0.5562 


0.5972 




8 3 


0.5774 


0.5788 


0.5771 


0.0106 


0.5575 


0.5991 




0i 


0.5774 


0.5771 


0.5779 


0.0106 


0.5561 


0.5971 


T — u.ou 


a 
PI 


0.5774 


0.5760 


0.5779 


0.0106 


0.5561 


0.5976 




r'o 


0.5774 


0.5783 


0.5766 


0.0104 


0.5571 


0.5979 




0i 


0.5774 


0.5769 


0.5752 


0.0107 


0.5567 


0.5987 


i — n 7^ 

T — U. t 


PI 


0.5774 


0.5754 


0.5746 


0.0109 


0.5548 


0.5973 






0.5774 


0.5790 


0.5803 


0.0109 


0.5568 


0.5992 




01 


0.5774 


0.5817 


0.5802 


0.0108 


0.5600 


0.6027 




PI 


0.5774 


0.5816 


0.5802 


0.0098 


0.5630 


0.6017 




03 


0.5774 


0.5675 


0.5644 


0.0106 


0.5471 


0.5881 




01 


0.5774 


0.5754 


0.5766 


0.0109 


0.5546 


0.5959 


r=0.95 


02 


0.5774 


0.5810 


0.5809 


0.0114 


0.5599 


0.6037 




03 


0.5774 


0.5743 


0.5738 


0.0105 


0.5543 


0.5951 




01 


0.5774 


0.5590 


0.5749 


0.0241 


0.5244 


0.6029 


r=0.99 


02 


0.5774 


0.5700 


0.5780 


0.0155 


0.5425 


0.5983 




03 


0.5774 


0.5601 


0.5722 


0.0218 


0.5277 


0.5984 



Example 4 

Here we follow a similar setup as in Example 1, except that the additive errors follow the 
ALD with cr = 0.05. That is we generate data sets from the model ([1]), independently for 
each value of r. This example mainly serves as an illustration that when the errors indeed 
follow ALD, so that estimation of a becomes meaningful, we can indeed estimate its value 
satisfactorily. These results are presented in Tables [6] and for n = 100 and n = 200 
respectively. We emphasize that the estimated a is meaningful only if the true ALD is 
used in estimation. As an illustration of this point, we consider data generated from (TjQ) 
using r = 0.5 and fitted using BQSIM at quantile level r = 0.75. In this case, the index 
vector (3 can still be estimated very close to the true values, while the estimated a is about 
0.01. In general, BQSIM cannot be used to estimate the error distribution directly. This 
limitation is discussed further in Section HJ In this example, the performance at r = 0.99 is 
less satisfactory than in previous examples. 

Example 5 

Finally, we increase the dimension in Example 1 to p = 10 to demonstrate the performance 
in higher dimensions. The only difference in the setup from Example 1 is that we now 
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beta_1 



beta_2 



beta_3 



r = 0.1 



r = 0.25 



r = 0.5 



d 1 



r = 0.75 



r = 0.9 



B100 Q100 B200 Q200 



B100 Q100 B200 Q200 



1100 Q100 B200 Q200 



Figure 2: Summarizing estimators of (3 for n = 100, 200 in example 1. 'B100' ('Q100' 
denotes BQSIM (QSIM) with n = 100, for example. 



set (3 = (1, 1, 1, 0, ... , 0) T and /3 = ^L(l, 1, 1, 1, ... , 1) T , and only consider n = 100. For 
these two cases, the estimation results are shown in Tables [S] and M respectively. To save 
space, boxplots comparing BQSIM and QSIM are not shown now that we are estimating 10 
coefficients. Our methods still perform much better than QSIM in these cases with higher 
dimensions. 
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Table 3: Comparison of MSE for BQSIM (using posterior mean as the estimator) and QSIM 
based on 100 replications in each case for simulation example 1. 









MSE (n = 100) 


MSE (n = 200) 








ft 


ft 


ft 


ft 


ft 


ft 


T= 


0.10 


QSIM 
BQSIM 


0.00239 
0.00041 


0.00447 
0.00051 


0.00886 
0.00045 


0.00346 
0.00017 


0.00473 
0.00023 


0.00052 
0.00020 


T= 


-0.25 


QSIM 
BQSIM 


0.00169 
0.00029 


0.00312 
0.00029 


0.00303 
0.00030 


0.00276 
0.00012 


0.00138 
0.00014 


0.00310 
0.00019 


T= 


:0.50 


QSIM 
BQSIM 


0.00269 
0.00023 


0.00154 
0.00025 


0.00254 
0.00029 


0.00058 
0.00013 


0.00053 
0.00013 


0.00089 
0.00018 


T= 


:0.75 


QSIM 
BQSIM 


0.00340 
0.00026 


0.00601 
0.00029 


0.00291 
0.00038 


0.00039 
0.00017 


0.00028 
0.00016 


0.00048 
0.00016 


T= 


:0.90 


QSIM 
BQSIM 


0.00424 
0.00040 


0.00794 
0.00049 


0.00641 
0.00059 


0.00139 
0.00023 


0.00100 
0.00028 


0.00192 
0.00030 


T= 


:0.95 


QSIM 
BQSIM 


0.00857 
0.00051 


0.00739 
0.00075 


0.00951 
0.00094 


0.00157 
0.00042 


0.00183 
0.00029 


0.00245 
0.00041 


T= 


=0.99 


QSIM 
BQSIM 


0.05034 
0.00083 


0.04259 
0.00098 


0.06166 
0.00125 


0.00494 
0.00062 


0.09708 
0.00192 


0.07046 
0.00099 



Table 4: Comparison of MSE for BQSIM (using posterior mean as the estimator) and QSIM 
based on 100 replications in each case for simulation example 2. 









MSE (n 


= 100) 


MSE (n 


= 200) 








ft 


ft 


ft 


ft 


T- 


=0.10 


QSIM 
BQSIM 


0.02510 
0.00841 


0.02991 
0.00197 


0.00744 
0.00451 


0.00238 
0.00115 


T- 


=0.25 


QSIM 
BQSIM 


0.01997 
0.00401 


0.00396 
0.00095 


0.00823 
0.00220 


0.00645 
0.00059 


T- 


=0.50 


QSIM 
BQSIM 


0.01098 
0.00225 


0.00409 
0.00056 


0.00214 
0.00179 


0.00058 
0.00041 


T- 


=0.75 


QSIM 
BQSIM 


0.02025 
0.00222 


0.00868 
0.00059 


0.00345 
0.00021 


0.00089 
0.00056 


T- 


=0.90 


QSIM 
BQSIM 


0.01983 
0.00393 


0.01227 
0.00104 


0.00584 
0.00393 


0.00196 
0.00104 


T- 


=0.95 


QSIM 
BQSIM 


0.02048 
0.00647 


0.00974 
0.00150 


0.01044 
0.00488 


0.01279 
0.00126 


T- 


=0.99 


QSIM 
BQSIM 


0.06640 
0.00953 


0.04652 
0.00226 


0.05911 
0.00997 


0.03972 
0.00210 
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betal 



betal 




Figure 3: The autocorrelation plots for /3i,/3 2 ,/?3. The left column represents the series 
produced by our partially collapsed sampler and the right column represents the uncollapsed 
sampler. 



3.2 Real data analysis 



Finally, we apply the proposed method to the tropical cyclone (TC) data. Coastal tropical 
cyclones pose a serious threat to social and economic institutions. It is necessary and useful to 
provide a statistical way to analyze the TC data and evaluate the risk of the next catastrophic 
cyclone. Here we consider a dataset consisting of a sample of 422 TCs occurri n g nea r the 
US coastline oyer a 108-year period (1899-2006). Following iJagger and Eisner! ( 120091 ) and 



Bondell et al.l ( 120101 ). we model the wind speeds from TCs with four climate variables: the 
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0.2 0.6 1.0 1.4 



0.2 0.6 1.0 1.4 



Figure 4: The estimated link functions. On the left panel, the values of estimated r\ n are 
plotted against the true index, while for the right panel the values of estimated rj n are plotted 
against the estimated index. 

Table 5: Comparison of MSE for BQSIM (using posterior mean as the estimator) and QSIM 
based on 100 replications in each case for simulation example 3. 









MSE (n 


== 100) 


MSE (n 


== 200) 








Pi 


Pi 


Pi 


P2 


T= 


=0.10 


QSIM 
BQSIM 


0.00767 
0.00033 


0.00447 
0.00008 


0.00174 
0.00018 


0.00114 
0.00005 


T= 


=0.25 


QSIM 
BQSIM 


0.00421 
0.00074 


0.00175 
0.00020 


0.00647 
0.00035 


0.00384 
0.00009 


T= 


=0.50 


QSIM 
BQSIM 


0.00843 
0.00173 


0.00306 
0.00044 


0.00631 
0.00090 


0.00351 
0.00022 


T= 


=0.75 


QSIM 
BQSIM 


0.03992 
0.00221 


0.05156 
0.00050 


0.01429 
0.00573 


0.00690 
0.00098 


T= 


=0.90 


QSIM 
BQSIM 


0.06118 
0.04856 


0.08449 
0.01129 


0.06311 
0.04571 


0.07486 
0.00219 


T= 


=0.95 


QSIM 
BQSIM 


0.07744 
0.07094 


0.05669 
0.05106 


0.09499 
0.08097 


0.17752 
0.05106 


T= 


=0.99 


QSIM 
BQSIM 


0.10262 
0.10064 


0.10510 
0.05996 


0.12993 
0.12071 


0.13203 
0.06277 



North Atlantic Oscillation Index (NAO), the Southern Oscillation Index (SOI), the Atlantic 
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beta_2 



r = 0.1 



t = 0.25 



r = 0.5 



r = 0.75 



r = 0.9 
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Figure 5: Summarizing estimators of (3 for n = 100, 200 in example 2. 



sea-surface temperature (SST) and the average sun spot number (SSN). The values of SOI, 
SST, and SSN are averaged values over the peak season of August through October and the 
value s of NAO are averaged oy er th e preseason and early season months of May and June. 
Both iJagger and Eisner! (120091 ) and iBondell et al.l ( 120101 ) used linear quantile regression to 
analyze this data. Here we apply our BQSIM to analyze how these climate variables influence 
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beta_1 



beta_2 



r = 0.1 



t = 0.25 



r = 0.5 



r = 0.75 



r = 0.9 
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Figure 6: Summarizing estimators of (3 for n = 100, 200 in example 3. 



the wind speeds of TCs. We also fitted QSIM described in IWu et al.l (120101 ) for comparison. 

The particular focus for this type of data is on the upper quantiles, as these extreme 
hurricane-strength storms are of considerable importance. We consider three different quan- 
tile levels r = (0.5,0.75,0.9,0.95,0.99). All covariates are scaled to have mean zero and 
standard deviation one. 
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Table 6: Results of BQSIM for simulation example 4, n — 100. 





n = 100 


True 


Mean 


Median 


S.D. 


2.5% 


97.5% 


MSE 






0i 


0.5774 


0.5805 


0.5818 


0.0353 


0.5084 


0.6476 


0.00117 


T- 


=0.10 


02 


^774 

U.U i i 


n ^81 4 

U . UO 14 


n ^771 

U . U 1 1 ± 


U.UOUO 


n ^1 9 c ! 


U . U4UO 


n nnnfio 

u.uuuuy 




03 


0.5774 


0.5641 


0.5615 


0.0350 


0.4923 


0.6304 


0.00132 






(J 


0.0500 


0.0521 


0.0527 


0.0054 


0.0430 


0.0637 


0.00002 






01 


0.5774 


0.5809 


0.5823 


0.0215 


0.5377 


0.6227 


0.00055 


T- 


=0.25 


02 


^774 
u.u / t *± 


u.u/ uz 


n ^7fi0 
U.u j UU 


o n9i 9 


U.U041 


n f>i 71 

U.Ul 1 1 


u.uuu^o 




03 


0.5774 


0.5723 


0.5704 


0.0218 


0.5290 


0.6145 


0.00063 






a 


0.0500 


0.0520 


0.0520 


0.0054 


0.0427 


0.0638 


0.00003 






01 


0.5774 


0.5787 


0.5786 


0.0176 


0.5432 


0.6129 


0.00030 


T- 


=0.50 


02 


^774 
u.u / t *± 


fl ^740 

U.UI 4U 


U.Ul ov 


n m 7R 




U.UU04 


n ooo9fi 

u.uuuzu 




03 


0.5774 


0.5778 


0.5783 


0.0179 


0.5425 


0.6127 


0.00030 






(T 


0.0500 


0.0514 


0.0515 


0.0053 


0.0421 


0.0631 


0.00003 






01 


0.5774 


0.5750 


0.5765 


0.0208 


0.5346 


0.6165 


0.00041 






02 


^774 
u.u / t *± 


n ^74^; 


U.Ul UU 


o n9i n 


o 

U.uOOU 


n fii ^ 

U.Uluu 


U.UUU4 1 


T- 


=0.75 


Q 

03 


0.5774 


0.5802 


0.5780 


0.0212 


0.5384 


0.6219 


0.00043 






<7 


0.0500 


0.0519 


0.0514 


0.0053 


0.0423 


0.0634 


0.00004 






01 


0.5774 


0.5734 


0.5682 


0.0342 


0.5055 


0.6401 


0.00083 


T- 


=0.90 


02 


^774 

U.U i ) 4 


U .UUOu 




\J .VJOOU 


u.uuzu 




000Q4 




03 


0.5774 


0.5842 


0.5829 


0.0342 


0.5178 


0.6521 


0.00131 






(T 


0.0500 


0.0514 


0.0517 


0.0054 


0.0422 


0.0633 


0.00003 






01 


0.5774 


0.5455 


0.5652 


0.0851 


0.3827 


0.6887 


0.01151 


T- 


=0.95 


02 


0.5774 


0.5611 


5658 


0.0611 


0.4412 


0.6818 


00352 




03 


0.5774 


0.5676 


0.5912 


0.0714 


0.4175 


0.6979 


0.01261 






a 


0.0500 


0.0518 


0.0518 


0.0053 


0.0422 


0.0640 


0.00003 






0i 


0.5774 


0.1115 


0.1589 


0.4957 


-0.7874 


0.8895 


0.28021 


T- 


=0.99 


02 


0.5774 


0.4797 


0.4677 


0.2595 


0.0817 


0.9325 


0.02603 




03 


0.5774 


0.1051 


0.1513 


0.4771 


-0.7418 


0.8867 


0.27783 






a 


0.0500 


0.0529 


0.0530 


0.0054 


0.0433 


0.0646 


0.00003 



Table ITUl compares the obtained index vectors estimated by BQSIM and QSIM and Figure 
[7] shows the estimated quantile curves of TC intensity at different levels. From the table, we 
can see that TC intensity heavily depends on SOI. The index vectors obtained by BQSIM 
and QSIM are qualitatively similar at lower quantiles, with more obvious deviations at levels 
above 0.9. This suggests that the estimates are not reliable for high quantile levels, especially 
for r = 0.95 and r = 0.99. The boxplots in Figure [8] (left columns) show the samples collected 
from the posterior distribution of (3 which are normalized to have unit norm. The histograms 
in Figure [8] show the implied distribution of d (see equation <^). For Gaussian processes, 
larger values of d correspond to smoother functions. While it is generally hard to compare 
the performance of BQSIM and QSIM for real data, our previous simulations suggest that 
BQSIM is more trustworthy. We also performed model fitting on bootstrapped data and 
observed that the BQSIM estimates are more stable across bootstrap samples, except for 
t = 0.99 where estimates obtain from both BQSIM and SIM are quite unstable. 
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Table 7: Results of BQSIM for simulation example 4, n — 200. 





n = 200 


True 


Mean 


Median 


S.D. 


2.5% 


97.5% 


MSE 






0i 


0.5774 


0.5813 


0.5797 


0.0209 


0.5389 


0.6213 


0.00042 


T- 


=0.10 


02 


^774 


fl ^749 

U . O t 4:^ 


U . O I t)U 


D 0991 




n fii fi7 


n 00041 

U.UUU4 1 




03 


0.5774 


0.5742 


0.5753 


0.0217 


0.5301 


0.6159 


0.00041 






(T 


0.0500 


0.0521 


0.0519 


0.0037 


0.0430 


0.0637 


0.00002 






0i 


0.5774 


0.5799 


0.5798 


0.0135 


0.5534 


0.6057 


0.00020 


T- 


=0.25 


02 


^774 




u.oouu 


o m 38 




fl fi098 
u.uuzo 


u.uuuzo 




03 


0.5774 


0.5747 


0.5754 


0.0139 


0.5476 


0.6018 


0.00025 






(T 


0.0500 


0.0515 


0.0528 


0.0037 


0.0427 


0.0638 


0.00002 






01 


0.5774 


0.5816 


0.5830 


0.0119 


0.5584 


0.6045 


0.00030 


T- 


=0.50 


02 


^774 
u.o i t *± 


fl ^748 

U.UI ^iO 


u.o i oo 


u.uiio 


U.OOIU 


u.oyou 


0009fi 
u.uuuzu 




03 


0.5774 


0.5749 


0.5752 


0.0118 


0.5517 


0.5978 


0.00030 






(T 


0.0500 


0.0522 


0.0517 


0.0038 


0.0447 


0.0592 


0.00001 






01 


0.5774 


0.5771 


0.5789 


0.0136 


0.5511 


0.6035 


0.00027 






02 


^774 
u.o i t *± 




fl ^774 


o ni 38 

U.UIOO 


u.oouo 


u.uuoy 


00091 
u.uuuzo 


T- 


=0.75 


03 


0.5774 


0.5770 


0.5751 


0.0135 


0.5505 


0.6029 


0.00022 






<7 


0.0500 


0.0512 


0.0512 


0.0037 


0.0448 


0.0594 


0.00002 






01 


0.5774 


0.5774 


0.5760 


0.0213 


0.5363 


0.6201 


0.00057 


T- 


=0.90 


02 


^774 

U.O i ) 4 


n ^soo 

U .uOUU 


n ^788 

U . O I oo 


f)91 7 


U.OO ( u 


n f!994 


00044 

U.UUU41 




03 


0.5774 


0.5720 


0.5697 


0.0217 


0.5295 


0.6152 


0.00047 






(T 


0.0500 


0.0511 


0.0518 


0.0037 


0.0447 


0.0593 


0.00001 






01 


0.5774 


0.5766 


0.5739 


0.0374 


0.5009 


0.6447 


0.00197 


T- 


=0.95 


02 


0.5774 


0.5795 


0.5841 


0.0337 


0.5128 


0.6471 


00099 




03 


0.5774 


0.5679 


0.5688 


0.0354 


0.5003 


0.6385 


0.00145 






(T 


0.0500 


0.0518 


0.0529 


0.0038 


0.0450 


0.0649 


0.00002 






01 


0.5774 


0.3256 


0.3507 


0.3545 


-0.3724 


0.8083 


0.12130 


T- 


=0.99 


02 


0.5774 


0.5350 


0.5428 


0.1791 


0.2028 


0.8372 


0.02907 




03 


0.5774 


0.3318 


0.3344 


0.2977 


-0.2720 


0.7914 


0.12535 






a 


0.0500 


0.0524 


0.0520 


0.0038 


0.0433 


0.0646 


0.00002 



4 Discussion 

In this article, we have proposed a Bayesian quantile regression method for single-index 
models based on a Gaussian process prior for the unknown link function. As detailed in 
the Appendix, we designed an efficient MCMC algorithm for posterior inference and demon- 
strated the superiority of the Bayesian approach to a modern non-Bayesian one. We carefully 
considered the possibility of marginalizing over the link function in some of the sampling 
steps, which leads to a partially collapsed sampler that balances sampling efficiency and 
implementation expediency. The performance of the proposed approach in our simulations 
is quite encouraging. 

We used zero-mean Gaussian process with squared exponential kernel. It is also possible 
to explicitly incorporate a mean function in the Gaussian process. This could add some 
flexibility to the model. For example, if a linear mean function is used, under independent 
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Table 8: Comparison of MSE for BQSIM and QSIM based on 100 replications in each case 
for simulation example 5, where /3 = (1, 1, 1, 0, ... , 0) T . 





n = 


100 


Pi 


P2 


Pa 


Pa 


p5 


T = 


=0.10 


QSIM 
BQSIM 


0.03051 
0.01000 


0.06458 
0.00251 


0.09402 
0.03926 


0.03894 
0.00068 


0.03484 
0.00460 


T = 


=0.25 


QSIM 
BQSIM 


0.00822 
0.00054 


0.01876 
0.00046 


0.04851 
0.00064 


0.01342 
0.00045 


0.00825 
0.00038 


T = 


=0.50 


QSIM 
BQSIM 


0.01385 
0.00031 


0.00483 
0.00039 


0.00308 
0.00043 


0.00617 
0.00033 


0.00417 
0.00039 


T = 


=0.75 


QSIM 


0.00106 

U.U(JU4o 


0.03489 

A AAAOA 

u. u (j Uoy 


0.03508 

a nnnc o 
U.UUUoo 


0.00102 

A AAAtJO 

(J.UUUoo 


0.00396 

a nnnco 
U.UUUoo 


T = 


=0.90 


BQSIM 


o 00^9 
0.00148 


u.UOUO ( 

0.00106 


0.00116 


01 7^9 
0.00156 


00040 

0.00151 


T = 


=0.95 


BQSIM 


u.uoy ±o 
0.00698 


U.UUO ( 1 

0.00368 


U. 1U 1 uo 

0.00401 


U.UOUU1 

0.00178 


U.UOUiZ 1 

0.00287 


T = 


=0.99 


BQSIM 


U.U^i^^O 

0.02259 


u.uo i yy 
0.01966 


U.UUOU i 

0.03773 


U.UUOIO 

0.02252 


U.UUOIO 

0.01228 




n = 


100 




Py 


Ps 


Pd 


010 


T = 


=0.10 


QSIM 
BQSIM 


0.02318 
0.01572 


0.02520 
0.00628 


0.02052 
0.00214 


0.03534 
0.00194 


0.04236 
0.00432 




=0.25 


QSIM 


0.01096 


0.01086 


0.00622 


0.00749 


0.01302 


T = 


BQSIM 


0.00056 


0.00034 


0.00038 


0.00054 


0.00036 


T = 


=0.50 


QSIM 
BQSIM 


0.00717 
0.00043 


0.01045 
0.00047 


0.00452 
0.00029 


0.00622 
0.00049 


0.00974 
0.00032 


T = 


=0.75 


QSIM 
BQSIM 


0.00117 
0.00035 


0.00386 
0.00055 


0.00157 

/ \ /A/A/A A f\ 

0.00049 


0.01653 
0.00064 


0.00803 
0.00046 


T = 


=0.90 


QSIM 
BQSIM 


0.00766 
0.00118 


0.04293 
0.00167 


0.01118 
0.00065 


0.01324 
0.00133 


0.02452 
0.00181 


T = 


=0.95 


QSIM 
BQSIM 


0.02806 
0.00288 


0.03451 
0.00200 


0.02941 
0.00451 


0.03739 
0.00305 


0.02923 
0.00263 


T = 


=0.99 


QSIM 
BQSIM 


0.08010 
0.01346 


0.06853 
0.01363 


0.05461 
0.01514 


0.06722 
0.01219 


0.05835 
0.01140 



zero-mean normal prior on the linear coefficients, the resulting process would be equivalent 
to a zero- me an Gauss i an pr ocess with an additional quadratic term in the kernel function, 
as shown by iMacKayl (119981 ). Thus the consideration on whether to use a non-zero mean is 
similar to deciding what kind of kernel to use. We take the view that a zero-mean Gaussian 
process with the quadratic exponential kernel function is already flexible enough to model a 
variety of curves and thus do not consider these additional possibilities in modelling. 

It is worth noting that we mainly regard ALD as a tool for estimating the conditional 
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Table 9: Comparison of MSE for BQSIM and QSIM based on 100 replications in each case 
for simulation example 5, where /3 = -775(1, 1,1,1,..., 1) T . 





71 = 


100 


0i 


02 


03 


04 


05 


T— 


-fl in 

-U.1U 


QSIM 
BQSIM 


0.00124 
0.00066 


0.00149 
0.00085 


0.00114 
0.00067 


0.00133 
0.00079 


0.00140 
0.00078 


T — 




QSIM 
BQSIM 


0.00101 
0.00047 


0.00077 
0.00035 


0.00072 
0.00040 


0.00089 
0.00039 


0.00050 
0.00038 


T— 


^0 

-U .rJU 


QSIM 
BQSIM 


0.00086 
0.00048 


0.00062 
0.00032 


0.00367 
0.00026 


0.00061 
0.00048 


0.00567 
0.00055 


T— 


=0.75 


QSIM 
BQSIM 


0.00102 
0.00064 


0.01325 
0.00048 


0.00135 
0.00045 


0.00122 
0.00053 


0.00173 
0.00048 


T- 


=0.90 


QSIM 
BQSIM 


0.00308 
0.00095 


0.00256 
0.00082 


0.01116 
0.00058 


0.00666 
0.00081 


0.00280 
0.00061 


T= 


=0.95 


QSIM 
BQSIM 


0.01535 
0.00129 


0.01194 
0.00144 


0.03278 
0.00125 


0.02291 
0.00162 


0.02617 
0.00129 


T= 


=0.99 


QSIM 
BQSIM 


0.01629 
0.01493 


0.05267 
0.00967 


0.02109 
0.01255 


0.04479 
0.01236 


0.04052 
0.01534 




n = 


100 


06 




08 


09 


010 


T— 


n in 


QSIM 
BQSIM 


0.00123 
0.00096 


0.00160 
0.00086 


0.00141 
0.0065 


0.00151 
0.00107 


0.00100 
0.00066 


T— 




QSIM 
BQSIM 


0.00074 
0.00054 


0.00101 
0.00048 


0.00069 
0.00052 


0.00067 
0.00043 


0.00084 
0.00033 


T— 


-u.ou 


QSIM 
BQSIM 


0.00098 
0.00037 


0.00058 
0.00048 


0.00180 
0.00033 


0.00178 
0.00037 


0.00064 
0.00029 


f— 


=0.75 


QSIM 
BQSIM 


0.00283 
0.00057 


0.00123 
0.00049 


0.00428 
0.00057 


0.00093 
0.00041 


0.00123 
0.00041 


T= 


=0.90 


QSIM 
BQSIM 


0.00385 
0.00070 


0.00258 
0.00071 


0.01294 
0.00079 


0.00964 
0.00105 


0.00341 
0.00064 


T= 


=0.95 


QSIM 
BQSIM 


0.02083 
0.00132 


0.02902 
0.00170 


0.04447 
0.00077 


0.02839 
0.00142 


0.05003 
0.00104 


r= 


=0.99 


QSIM 
BQSIM 


0.04999 
0.01191 


0.04875 
0.01088 


0.07079 
0.01134 


0.06024 
0.02032 


0.05711 
0.01921 



quantile, much like in the frequentist approach. By assuming the errors follow ALD, the 
posterior can give spurious confidence if the underlying data come from a different model 
than ALD. This is like using a quasi-likelihood to replace the true likelihood in frequentist 
estimation. In particular, the error distribution may not be accurately estimated by our 
approach. As we demonstrated, the method does accurately estimate the index vector and 
the link function. Finally we remind the readers that ALD can lead to incoherent inferences 
in the sense that quantile curves are permitted to intersect each other. This is because the 
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Table 10: Estimates from BQSIM and QSIM for the TCs data. 









BQSIM 


QSIM 


T- 


= 0.50 


NAOGSi) 

SST(/3 3 ) 
SSN(/3 4 ) 


0.46300 

0.01665 
0.51756 


0.49633 

-0.06932 
0.47965 


T- 


=0.75 


NAOGSi) 

0(J1{P2 ) 

SST(/3 3 ) 
SSN(/3 4 ) 


0.37840 
-u.o t ouy 
-0.19373 
0.07462 


0.51400 

-u. 1 o^io 

-0.20965 
0.39093 


T- 


=0.90 


NAOG80 

SST(/3 3 ) 
SSN(/? 4 ) 


-0.13451 

-0.40714 
-0.18082 


0.36110 

n fi991 9 

-0.58222 
-0.37594 


T- 


=0.95 


NAOGSi) 

UKJ±yp2 } 

SST(/3 3 ) 
SSN(£ 4 ) 


-0.16689 
-0 90598 
-0.30985 
-0.08797 


0.10274 
-0.11676 
-0.83128 
-0.53364 


T- 


=0.99 


NAO(/30 
SOIOSa) 
SST(/3 3 ) 
SSN(£ 4 ) 


-0.03822 
-0.41609 
0.88491 
0.13593 


0.10860 
-0.12104 
-0.83293 
-0.52894 



inference for distinct quantiles would proceed separately/independently and there is nothing 
to prevent them from overlapping. 

In terms of computational speed, on an ordinary PC, fitting the BQSIM on a single 
generated dataset under our simulation setup would take about 10-20 minutes. This is less 
of a problem for our real data but is quite a burden for simulations. Due to the relatively 
slow computational speed which is a common problem that plagues MCMC algorithms, some 
approximation methods such as variational Bayes might be desirable, but this is outside the 
scope of the current paper. 

As an extension of the current study, one can consider multiple-index models in quantile 
regression. However, sampling the index matrix poses some serious challenges and is outside 
the scope of the current investigation. 
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Figure 7: The estimated link functions by BQSIM (left column) and QSIM (right column), 
for TC intensity at different quantiles. 

Appendix: MCMC algorithm details 

The posterior distribution for all of the unknown parameters and latent variables is propor- 
tional to the joint distribution, given by 
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Figure 8: The boxplots on the left shows the posterior distribution of (3 (after normalization 
to unit norm). The histograms on the right shows the posterior distribution of d = l/\\(3\\ 2 . 
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r; n , e n ,, a, X,j\v) 

(y-Vn- k 1 e n ) T E~ 1 {y -rj n - k x e r 



oc exp 



x (det [E])' 1 ' 2 detlCr 1 / 2 exp | | 
p \ « 1 

j=i i=i 

X (^)°' +leXP {^}(^)° ,+leXP {"7} A ° A " eXP{ ^ A} ' 



The Metropolis-within-Gibbs algorithm may be used to sample from the posterior distri- 
bution. Mathematically speaking, it is possible to integrate out r] n before sampling , and it 



is well-known that marginalization can improve the mixing of the chain ( iLiul . 120081 ) . How- 
ever, if r\ n is not sampled, then the full conditional distributions for and a are no longer 
well-known distributions which leads to extra difficulty in sampling. On the other hand, we 
note the full conditional distribution of f3 is 



7T03K, A, a, y) oc det^]^ exp j- ^M x f[ 



— P -m\/° 

2a 



and thus the evaluation of the density involves the inverse of C n . Unfortunately, since C n is 
a kernel matrix, in many simulations we found it is nearly singular. When r) n is integrated 
out, this singularity problem is avoided since we only have to compute the inverse of the 
matrix C n + E, where E is a diagonal matrix (see ([3]) below). 

The conditional posterior densities of all the parameters and variables, except for (3 and 7, 
are common distributions. The conditional distributions used in the sampling are presented 
below. r\ n is marginalized out when considering the posterior conditional distribution of /3 
and 7. 

n((3\e n ,a, A, 7,?/) oc J n(y\e n , a, r} n )Tr(ri n \l3,j)dri n x n({3\a, A) 

(y - he n ) T (E + C n )-\y - k x e n ) \ 
2 (3) 



oc exp 



v 



x (det[C n + E])- 1 ' 2 x J]V A ^ I/<T , 
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7r(7|A e n , a, A, y) = n("/\(3, e n , a, y) 

oc J 7r(y\e n ,a,r) n )7r(r] ri \{3,~f)dr) n x ir{j) 

(y - k ien ) T (E + C n )-\y - he n ) \ (4) 



oc exp 



2 



x 



{det[C n + E])-^ (^) a7+lex p{-^} 



7r{r) n \/3,e n ,a, A, 7, y) = 7r(r7 n |/3, e n , a, 7, y) ~ iV(/i„,S„), 

li n = C n (C n + E)- 1 (y-k 1 e n ), (5) 
= C n (C n + E) l E, 

A, 7, y) = 7r(o-|77 n , /3 n , e n , A, y) ~ 7G(a ff , z/ CT ), 
« CT = — + p + a<7, 



4 = 1 ^ ' j = l 



7r(A|cr,/3,r7 n ,e n ,7,y) = 7r(A|a,/3) ~ Ga(a 7 + p, 6 7 + \/3j\ /a 2 ). 

i=i 

The full conditional distribution for ej is a generalized inverse Gaussian distribution 
(GIG), 



. 1 /(j/i - r](xi T '/3)) 2 / fc? 2 
7r( ei |a,/3,r7 n , A,7,y) = 7r( ei |cr, rj n , y) ~ GIG \ -, W — , y ' — + - . , 

where the probability density function of GIG(p, m, n) is 

r, 1 \ {n/m) p 1 f l/2-i 2 n 

j{x\p,m,n) = ——— -ar exp < — (m x + n x) 

2K p (mn) [ 2 

a: > 0, —00 <p<oo,m>0,ri>0, 



and is the modified Bessel function of the third kind (jBarndorff-Nielsen and Shephard 



200l|). 
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We use a superscript to denote the sampled values of different quantities at iteration 
t. The variables r]n , A®, 7®, e„' can be directly generated in R based on the respective full 
conditional distributions. For (3^\ we use a Metropolis step with proposal distribution 
N \(3^~ l \ (J 2 pl) , and for 7^, we propose the new value from Xog^^ ~ iV(log7^ -1 \ <x^). In 
practice, erg and <r 7 are manually tuned to ensure the acceptance rate to be within 10% ~ 
30%. This manual tuning is simplified by transforming all predictors and responses to have 
mean and variance 1 before running the MCMC algorithm. 

Our sampling strategy is "partially collapsed" in the sense of Ivan Dyk and Parkl ( 120081 ). 
and in particular is similar to Sampler 7 in that paper. It is easy to see the validity of 
the constructed sampler (that is, it does not change the stationary distribution). More 
specifically, to obtain this partially collapsed sampler by modifying the Gibbs sampler, we 
first marginalize the full conditional distributions for (3 and 7 to get ir(/3, fj n \e n , a, A, 7, y) 
and 7r(7, r) n \(3, e n , a, A, y), by moving rj n from being conditioned to being sampled. Since 
r) n is sampled again in (jSJ) immediately following (3 and 7, the two intermediate 77 n 's are 
redundant and t hus can be trimmed, resu lting in fl3]) and fl4]) respectively. It was shown 
in Theorem 1 of Ivan Dyk and Parkl (120081 ) that the marginalization step can only improve 
the autocorrelation of the chain. As shown in our simulation examples, this improvement is 
dramatic for our specific problem. 
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